Erratum To 'A Statistical Approach To Machine Translation'

Authors

Peter F. Brown

Stephen A. Della Pietra

Frederick Jelinek

Robert L. Mercer

John Cocke

Vincent J. Della Pietra

John D. Lafferty

Paul S. Roossin

Abstract

In Section 6 of "A statistical approach to machine translation" (Computational Linguistics 16(2), 79-85), we reported the results of two experiments in which we estimated parameters of a statistical model of translation from English to French. In the first experiment, the English and French vocabularies each consisted of 9,000 common words, and the model parameters were estimated from 40,000 pairs of sentences 25 words or less in length. Words outside the 9,000-word vocabularies in these sentences were mapped to special unknown words. In the second experiment, the vocabularies were limited to 1,000 common English words and 1,700 common French words, and the model parameters were estimated from 117,000 pairs of sentences 10 words or less in length that were completely covered by the respective vocabularies. In Figures 4, 5, and 6 of the paper, we erroneously presented parameter estimates from the 1,000-word experiment, while claiming in the text that they were from the 9,000-word experiment. The parameter estimates for these two experiments differ considerably because of the restriction of the training corpus in the 1,000-word experiment to short, covered sentences. For example, the probability that hear is translated as bravo

Similar resources

A new model for persian multi-part words edition based on statistical machine translation

Multi-part words in English language are hyphenated and hyphen is used to separate different parts. Persian language consists of multi-part words as well. Based on Persian morphology, half-space character is needed to separate parts of multi-part words where in many cases people incorrectly use space character instead of half-space character. This common incorrectly use of space leads to some s...

A Hybrid Machine Translation System Based on a Monotone Decoder

In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...

Improvement and Development of an English-to-Persian Translation Assistant System

In recent years, significant improvements have been achieved in statistical machine translation (SMT), but still even the best machine translation technology is far from replacing or even competing with human translators. Another way to increase the productivity of the translation process is computer-assisted translation (CAT) system. In a CAT system, the human translator begins to type the tra...

The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language

Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...

A Comparative Study of English-Persian Translation of Neural Google Translation

Many studies abroad have focused on neural machine translation and almost all concluded that this method was much closer to humanistic translation than machine translation. Therefore, this paper aimed at investigating whether neural machine translation was more acceptable in English-Persian translation in comparison with machine translation. Hence, two types of text were chosen to be translated...

Publication date: 1991
